ABSTRACT


A two-person search/ambush game is considered in which each player wants to maximize his own survival time while minimizing that of his adversary. The game is set in the context of convoy routing, where each player chooses which route to take. Each player's estimated survival time depends upon (a) whether his adversary is directly searching on that route, (b) the indirect probability of detection or hazard if his adversary is not on that route, and (c) the risk involved in moving from route to route. A player can be interdicted even if his adversary is not on his route. Each player has a payoff matrix of expected times to capture, which he seeks to maximize. We show that the two payoff matrices can be evaluated as a bimatrix game that yields mixed Nash equilibria through the use of non-linear programming. The results of this evaluation can be used to conduct route clearing and convoy routing optimally.
I. INTRODUCTION


A. BACKGROUND 


Military convoys, also referred to as Combat Logistic Patrols (CLPs), have always been lucrative targets for adversaries that lack the military advantage to sustain and win a direct conflict. The safety of a convoy is often seen as a function of active measures employed by the convoy, such as speed, increased aggressiveness, and patrolling. To this end, Matthew Hakola used agent-based modeling and principal component analysis to investigate which variables carry more weight in the success of the convoy [1]. However, when faced with multiple routes, a commander must weigh the threat associated with each route before choosing which one to take. William Ruckle’s geometric approach to the game theory behind a hunter and prey has been useful in understanding the foundation of our game [2].


Our convoy, which we will name Blue, wishes to traverse an area but faces the threat of being intercepted between the start and finish of his route by an enemy lying in wait. We will call the enemy Red. Using Ruckle’s example, we will assume our area is a unit square with a horizontal x-axis and a vertical y-axis. We will assume that neither Red nor Blue receives any intelligence on the position of the other once Blue starts to move. Because of this limitation, Red gains no advantage from changing locations while Blue is in motion, and so we will consider Red’s position fixed. Blue simply wishes to get from x = 0 to x = 1. With this in mind we can disregard Red’s x-coordinate, as it does not affect his ability to intercept Blue. Instead, we focus on Red’s ability to intercept Blue as a function of his y-coordinate and the range of his weapon, which we will define as r. Therefore, our unit box can now be decomposed into routes of 1/r width. Blue gains no advantage by changing routes while traversing the area but can change routes with each new crossing. Using this logic, we limit our game to the use of individual routes rather than a network of routes.


The limitation of most ambush models is that the threat is always one-sided. This is evident in Ruckle’s examples [2], as well as in textbook examples on game theory [6]. In reality, an ambusher must also contend with the fact that he may be discovered before he gets the opportunity to conduct an ambush. Furthermore, the threat does not always come from an active searcher (also referred to as direct detection). Rewards and humanitarian assistance can influence an area to be more proactive in uncovering and turning in cells that are planning ambushes (also referred to as indirect detection). The game therefore becomes more complex, with each side trying to maximize its own survival time while minimizing its opponent’s, taking both indirect and direct detection into account.


We approach this model using a “deductive” search methodology following work done by Owen and McCormick [3]. This search methodology focuses on determining those routes most favorable to each player rather than trying to follow any “trail” left by their presence. Their algorithm provides us with an expected time to capture or ambush for both players. This time depends on the probability of both direct and indirect detection as well as the ability to change routes successfully. Then, given the expected time to capture or ambush for each player along each route, we use non-linear programming to determine the optimal strategy for each player to adopt in order to maximize individual survival time.


B. RESEARCH GOAL 


The goal of this paper is to provide a way to investigate and determine the mixed strategies used by both the convoys and adversaries along a relatively small number of routes. This will allow commanders to make informed decisions on which routes to take to maximize the survival rate of CLPs, as well as which routes are most likely to be ambush sites.


C. BASIC OVERALL MODEL


The approach we will use is that of a bimatrix game with the goal of finding equilibrium points. Our players are defined as Blue, for the convoy, and Red, for the ambusher. Each player has his own payoff matrix, in which his goal is to maximize his survival time while his adversary simultaneously tries to minimize it. We define $\Psi$ as the payoff matrix for the Search Model, where Red wishes to maximize his survival time while Blue searches for him. Our other payoff matrix, $A$, represents the Ambush Model, where Blue wishes to maximize his survival time while Red attempts to ambush him. These two payoff matrices are then combined into a bimatrix model $(\Psi, A)$ that represents the competing goals of each player (i.e., maximize individual survival time while minimizing the opponent’s).


The Search and Ambush payoff matrices are constructed by adopting a model developed by Owen and McCormick [3]. Their manhunting model considers a fugitive who faces not only the threat of being captured directly but also the threat of being turned in by the local populace [3]. The Search Model adapts this directly for Red’s threat of being found through Blue’s direct search and Blue’s efforts to uncover him indirectly (i.e., through the actions of others). The Ambush Model takes into account the threat Blue faces from a direct ambush by Red as well as the indirect hazards that may prevent a convoy from being completed (terrain, length of route, etc.). Each route presents four initial probabilities:

1) The probability Red is detected indirectly by a third party (indirect detection (q))

2) The probability Red is detected directly by Blue (direct detection (p))

3) The probability Blue fails to complete the convoy for reasons other than an ambush (indirect hazard (r))

4) The probability Blue fails to complete the convoy because he was ambushed by Red (direct hazard (s))


The rate at which each initial probability increases depends upon the intensity of the adversary’s efforts. The intensity of each player’s efforts to minimize his opponent’s survival time is represented by the variables $\alpha$, $\beta$, and $\gamma$. The amount of money and resources Blue spends on gaining the assistance of the local populace to uncover Red affects how quickly the threat of indirect detection increases. This rate of increase is represented by $\alpha$ and can be seen as quantifying the aggressiveness of Blue’s efforts to detect Red indirectly. The intensity of a direct search for Red (possibly as a function of the number of Soldiers, sensors, etc. involved) determines how quickly the threat of direct detection increases along a certain route and is represented by $\beta$, the aggressiveness of Blue’s efforts to uncover Red directly. We assume that the quality of the route (which affects the indirect threat to the convoy) will not change with repeated iterations along that route, and therefore we limit ourselves to only one parameter for the threat’s rate of increase in the Ambush Model. With every use of a route, Blue faces the risk of Red moving onto that route to intercept him on the next convoy. The rate at which this threat increases is defined as $\gamma$, Red’s aggressiveness.


Once we construct the individual payoff matrices, we construct the bimatrix model $(\Psi, A)$, where each cell contains the pair of values from the respective individual payoff matrices. Using non-linear programming, we determine the optimal route selection strategies for each player. We show that these optimal strategies (also known as Nash equilibria) depend upon the desired survival time for each player, which can be viewed as each player’s decision to be risk averse or risk prone. These optimal strategies are then used to determine which routes a convoy should take, as well as on which routes a patrol is most likely to uncover the enemy.


II. SEARCH MODEL


A. OVERVIEW 


We use a model created by Owen and McCormick [3] to determine the expected time until capture (or ambush) for each player. For the benefit of the reader, and to add our own notation that will be useful later in the analysis, we will cover it again in Chapters II and III of this thesis. For our purposes, we define “indirect detection” as the threat Red faces of being discovered indirectly and “direct detection” as the threat he faces of being found directly by Blue.


1. Indirect Detection 


The local populace of a region can pose a risk to Red and shorten his survival time. Blue can take advantage of this in a number of ways. He could offer greater rewards for revealing Red, or he could gain the confidence of the local population so that it is in their best interest to betray Red. We can safely assume that the longer Red occupies an area, the greater the risk of discovery. Assuming that Blue is looking for Red on another route, Red’s probability of detection and capture within $t$ units of time on route $i$ will be represented by $Q_i(t)$, and the probability he has not already been captured is $1 - Q_i(t)$. This probability of capture is not static but rather increases with time. We let $g_i(t)$ represent the rate at which this probability increases along route $i$ while Red is on it. Making the additional assumption that Red’s risk is initially zero when he enters the route, we get the following differential equation and its solution:

$$Q_i'(t) = g_i(t)\,\bigl(1 - Q_i(t)\bigr), \qquad Q_i(0) = 0$$

$$Q_i(t) = 1 - \exp\{-G_i(t)\} \tag{1}$$


Note that the derivative of $Q_i(t)$ is the rate of increase in the probability of detection: the probability that Red has not yet been detected multiplied by the rate of increase for that route. We further define $g_i$ as a linear, strictly increasing, unbounded function made up of the initial probability, $q_i$, that Red will be captured on that route plus a linear rate of increase, $\alpha_i \ge 0$, that controls how quickly that probability increases with time. Owen and McCormick used a constant $\alpha = 0.01$ in their examples as a reasonable rate of increase. In our game, we can assign a slightly larger value if Blue is very aggressively pursuing indirect means of detection along that route. Clearly, the value can change for each route based on Blue’s efforts. $G_i$ is the antiderivative of $g_i$ evaluated from zero to time $t$. This gives us

$$g_i(t) = q_i + \alpha_i t, \qquad G_i(t) = \int_0^t g_i(s)\,ds = t q_i + \frac{\alpha_i t^2}{2} \tag{2}$$
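As a minimal sketch of equations (1) and (2), assuming the linear rate $g_i(t) = q_i + \alpha_i t$ defined above, the following Python functions evaluate $G_i(t)$ and $Q_i(t)$. The function names and the sample values of `q` and `alpha` are illustrative assumptions and are not taken from Owen and McCormick.

```python
import math

def cumulative_rate(t, q, alpha):
    """G_i(t) = q*t + alpha*t^2/2, the integral of g_i(s) = q + alpha*s over [0, t]."""
    return q * t + 0.5 * alpha * t * t

def detection_prob(t, q, alpha):
    """Q_i(t) = 1 - exp(-G_i(t)): probability of indirect detection by time t."""
    return 1.0 - math.exp(-cumulative_rate(t, q, alpha))

if __name__ == "__main__":
    q, alpha = 0.05, 0.01  # hypothetical values chosen only for illustration
    for t in (0, 5, 10, 20):
        print(t, round(detection_prob(t, q, alpha), 4))
```

With these sample values the detection probability starts at zero and rises toward one as Red remains on the route, which is the behavior the differential equation in (1) describes.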





Applying (2) to the solution in (1) we obtain

$$Q_i(t) = 1 - \exp\{-G_i(t)\}, \qquad Q_i'(t) = g_i(t)\,\bigl(1 - Q_i(t)\bigr) = g_i(t)\exp\{-G_i(t)\} \tag{3}$$

Red wants to maximize his survival time, so we can assume that at some time $T$ he will decide to move, if he has not already been captured. Keep in mind that $Q_i(T)$ is the probability that Red is captured by time $T$, so $1 - Q_i(T)$ is the probability that Red will be able to move at time $T$ (i.e., he is not captured). We will assign the random variable $X_i$ to represent the time Red spends on route $i$ before moving. In determining this time, we must take into account not only the probability that he will be able to move at that time, but also the probability that he is captured before time $T$, given the density $Q_i'(t)$ from 0 to $T$. This gives us the formula for the expected time Red spends on route $i$:

$$E[X_i] = \int_0^T t\,Q_i'(t)\,dt + \bigl(1 - Q_i(T)\bigr)\,T \tag{4}$$

We can simplify this even further using (3):

$$E[X_i] = \int_0^T t\,g_i(t)\exp\{-G_i(t)\}\,dt + T\exp\{-G_i(T)\} \tag{5}$$


Red’s move does not come without risk. He can always be detected en route to his next ambush location. Independent of when he moves, he expects to survive an additional $V_i$ units of time after starting the move. Given that the probability that he will even get to move is $\exp\{-G_i(T)\}$, we can define his total expected survival time on route $i$ as

$$A_i = \int_0^T t\,g_i(t)\exp\{-G_i(t)\}\,dt + T\exp\{-G_i(T)\} + \exp\{-G_i(T)\}\,V_i \tag{6}$$

Red chooses his departure time $T$ so as to maximize $A_i$. To find this maximum we differentiate $A_i$ with respect to $T$:

$$\frac{dA_i}{dT} = T g_i(T)\exp\{-G_i(T)\} + \exp\{-G_i(T)\} - T g_i(T)\exp\{-G_i(T)\} - g_i(T)\exp\{-G_i(T)\}\,V_i$$

When simplified we get

$$\frac{dA_i}{dT} = \exp\{-G_i(T)\}\,\bigl(1 - g_i(T)\,V_i\bigr) \tag{7}$$

Setting this to zero and solving for $T$ produces the following:

$$T_i = g_i^{-1}\!\left(\frac{1}{V_i}\right) \tag{8}$$
Since $g_i$ is a strictly increasing function, we can be assured that $T_i$ is unique. Now we notice that we can simplify equation (6) if we integrate by parts, letting $u = t$ and $dv = g_i(t)\exp\{-G_i(t)\}\,dt$.

Note: $$\int_0^T t\,g_i(t)\exp\{-G_i(t)\}\,dt = -T\exp\{-G_i(T)\} + \int_0^T \exp\{-G_i(t)\}\,dt$$

So Red’s expected time to indirect detection along route $i$ becomes

$$A_i = \int_0^T \exp\{-G_i(t)\}\,dt + \exp\{-G_i(T)\}\,V_i \tag{9}$$

Keeping in mind how we defined $g_i$, we can take equation (8) and rewrite it as

$$q_i + \alpha_i T_i = \frac{1}{V_i}$$

Solving for $T_i$ we get

$$T_i = \frac{1 - q_i V_i}{\alpha_i V_i} \tag{10}$$


We can see that $T_i$ has the potential of being a negative number or zero. Since we assume that $\alpha_i$ and $V_i$ are positive, this can only occur when the initial probability $q_i$ is sufficiently large that $V_i \ge q_i^{-1}$. Such a cell would present a significant risk to Red and offer no gain in his expected survival time. Red would move immediately if he found himself on such a route. We will therefore choose $T_i$ by (10) if it is positive and set $T_i = 0$ otherwise. Intuitively this makes sense. If Red were to move into the Green Zone in Iraq, his initial probability of detection would be so high that he would immediately move to another location. We can also see that by increasing $\alpha_i$, Blue forces Red to move more frequently and to risk in-transit detection more often.
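The departure rule in (10), together with the floor at zero described above, can be sketched in a few lines of Python. The function name and the numeric inputs are hypothetical and serve only to illustrate how $q_i$, $\alpha_i$, and $V_i$ interact.

```python
def departure_time(q, alpha, v):
    """T_i from equation (10), set to zero when the formula is not positive.

    q     : initial detection probability on the route (q_i)
    alpha : linear rate of increase of the risk (alpha_i), assumed > 0
    v     : expected survival time after leaving the route (V_i), assumed > 0
    """
    if alpha <= 0 or v <= 0:
        raise ValueError("alpha and v must be positive")
    return max((1.0 - q * v) / (alpha * v), 0.0)

if __name__ == "__main__":
    # Illustrative values only: a more aggressive Blue (larger alpha) shortens Red's stay.
    for alpha in (0.01, 0.02, 0.05):
        print(alpha, departure_time(0.05, alpha, 10.0))
```

At the returned time the first-order condition $g_i(T_i) V_i = 1$ from (7) holds whenever $T_i$ is positive; when $q_i V_i \ge 1$ the function returns zero and Red leaves at once.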


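Going back to (2) and completing the square in $G_i(t)$, we get the following:

$$\exp\{-G_i(t)\} = \exp\left\{-\left(t q_i + \frac{\alpha_i t^2}{2}\right)\right\} = \exp\left\{-\frac{\alpha_i}{2}\left(t + \frac{q_i}{\alpha_i}\right)^2 + \frac{q_i^2}{2\alpha_i}\right\} = \exp\left\{\frac{q_i^2}{2\alpha_i}\right\}\exp\left\{-\frac{\alpha_i}{2}\left(t + \frac{q_i}{\alpha_i}\right)^2\right\}$$

Making this substitution into (9) produces the following:

$$A_i = \exp\left\{\frac{q_i^2}{2\alpha_i}\right\}\int_0^{T_i} \exp\left\{-\frac{\alpha_i}{2}\left(t + \frac{q_i}{\alpha_i}\right)^2\right\}dt + \exp\{-G_i(T_i)\}\,V_i \tag{11}$$

By letting $u = \left(t + \frac{q_i}{\alpha_i}\right)\sqrt{\frac{\alpha_i}{2}}$, (11) becomes

$$A_i = \exp\left\{\frac{q_i^2}{2\alpha_i}\right\}\sqrt{\frac{2}{\alpha_i}}\int_{U_{i1}}^{U_{i2}} \exp\{-u^2\}\,du + \exp\{-G_i(T_i)\}\,V_i \tag{12}$$

where our lower and upper limits of integration become

$$U_{i1} = \frac{q_i}{\sqrt{2\alpha_i}} \quad\text{and}\quad U_{i2} = \frac{q_i + \alpha_i T_i}{\sqrt{2\alpha_i}} \tag{13}$$

Keeping in mind the error function $\operatorname{erf}(x) = \frac{2}{\sqrt{\pi}}\int_0^x \exp\{-y^2\}\,dy$, we can then obtain $A_i$ in terms of it:

$$A_i = \frac{\exp\left\{\frac{q_i^2}{2\alpha_i}\right\}\sqrt{\pi}\,\bigl[\operatorname{erf}(U_{i2}) - \operatorname{erf}(U_{i1})\bigr]}{\sqrt{2\alpha_i}} + \exp\{-G_i(T_i)\}\,V_i \tag{14}$$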
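One way to check the closed form in (14) is to compare it with a direct numerical evaluation of the integral in (9). The sketch below does this with the standard-library `math.erf` and a simple trapezoidal rule. The function names, step count, and sample parameters are assumptions made for illustration.

```python
import math

def survival_closed_form(q, alpha, T, v):
    """A_i from equation (14): erf term plus exp(-G_i(T)) * V_i."""
    root = math.sqrt(2.0 * alpha)
    u1 = q / root                      # lower limit U_i1 from (13)
    u2 = (q + alpha * T) / root        # upper limit U_i2 from (13)
    head = (math.exp(q * q / (2.0 * alpha)) * math.sqrt(math.pi)
            * (math.erf(u2) - math.erf(u1)) / root)
    G_T = q * T + 0.5 * alpha * T * T
    return head + math.exp(-G_T) * v

def survival_numeric(q, alpha, T, v, steps=2000):
    """A_i from equation (9), integrating exp(-G_i(t)) by the trapezoidal rule."""
    def f(t):
        return math.exp(-(q * t + 0.5 * alpha * t * t))
    h = T / steps
    total = 0.5 * (f(0.0) + f(T)) + sum(f(k * h) for k in range(1, steps))
    return total * h + f(T) * v

if __name__ == "__main__":
    # Hypothetical inputs: q_i = 0.05, alpha_i = 0.01, T_i = 5, V_i = 10.
    print(survival_closed_form(0.05, 0.01, 5.0, 10.0))
    print(survival_numeric(0.05, 0.01, 5.0, 10.0))
```

The two values should agree to several decimal places, which gives some assurance that the change of variable and the limits in (13) were applied consistently.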
2. Direct Detection 


We define direct detection as the threat Red faces if he is located on the same route as Blue. In reality, Blue can directly search on multiple routes, but for this model Blue’s direct search is limited to the route the convoy is on. Clearly, if Red happens to be on the same route as Blue, he will face two risks: 1) the risk of being “given up” and 2) the risk of being found by Blue before he can ambush the convoy. The probability that Red will be detected in a direct search by time $t$ is given by

$$R_i(t) = 1 - \exp\{-F_i(t)\} \tag{15}$$

We make the assumption that Blue’s level of aggressiveness associated with directly finding Red will be the same regardless of which route Blue is on. If the physical characteristics of the route or other limitations violate this assumption, we will need to differentiate this level of aggressiveness by route, as we did for the indirect detection parameter $\alpha_i$. Keeping this in mind, we define the forcing function for the risk of direct detection as

$$f_i(t) = p_i + \beta t, \qquad F_i(t) = \int_0^t f_i(s)\,ds = t p_i + \frac{\beta t^2}{2} \tag{16}$$

where $p_i$ is the initial probability that Red will be found if he is on the same route as Blue’s recon, and $\beta$ (Blue’s direct aggressiveness) is the rate at which that risk increases as long as he remains on the same route as Blue.


Assuming that Blue and Red are on the same route, we will define the time at which Red decides to move as $D_i$, with his expected survival time after leaving route $i$ still $V_i$. Red’s total expected survival time, including the time he spends on the same route as Blue, is therefore given by

$$B_i = \int_0^{D_i} t\,f_i(t)\exp\{-F_i(t)\}\,dt + D_i\exp\{-F_i(D_i)\} + \exp\{-F_i(D_i)\}\,V_i \tag{17}$$

Further refining this, as we did in the case of indirect detection, we obtain

$$B_i = \exp\{-F_i(D_i)\}\,V_i + \frac{\sqrt{\pi}\,\exp\left\{\frac{p_i^2}{2\beta}\right\}\bigl[\operatorname{erf}(W_{i2}) - \operatorname{erf}(W_{i1})\bigr]}{\sqrt{2\beta}} \tag{18}$$

where, as before, our limits of integration become

$$W_{i1} = \frac{p_i}{\sqrt{2\beta}} \quad\text{and}\quad W_{i2} = \frac{p_i + \beta D_i}{\sqrt{2\beta}} \tag{19}$$

In the same manner as we derived $T_i$ in (10), we get the following result for when Red will leave the route he is on if Blue is directly searching there:

$$D_i = \frac{1 - p_i V_i}{\beta V_i} \tag{20}$$

We can make the reasonable assumption that the indirect detection probability $q_i$ will be less than the direct detection probability $p_i$. We also assume that the rate of increase in a direct search, $\beta$, will be equal to or greater than the rate of increase in an indirect search, $\alpha_i$. Put another way, Blue’s patrols will be more aggressive, and therefore more likely to find Red, than any effort to have a third party uncover Red’s location. (There may be cases where the inhabitants of an area will be more effective than Blue at capturing Red, but then Red would avoid these areas, as they give him no advantage.) Since $p_i > q_i$ and $\beta \ge \alpha_i$, it is apparent that $D_i < T_i$ whenever $T_i$ is positive, and Red will always depart more quickly if Blue is directly searching on the same route as him. As with $T_i$, $D_i$ has the potential of being negative. In that event, we will let $D_i = 0$.
3. Stochastic Process

As noted by Owen and McCormick, this is clearly a stochastic game in which Red’s survival time depends upon how often he is allowed to move. To denote this, we will use $A_i^m$, where $m$ represents the number of times Red is allowed to move, $i$ is the route that he starts on, and the assumption is that Blue is searching on another route. If Blue is on the same route as Red, we will represent Red’s expected survival time as before with $B_i^m$. After he moves, Red will be able to move $m-1$ additional times, and he expects his remaining survival time to be $V_i^{m-1}$.

Considering the case when Red is not allowed to move ($m = 0$), we let $V_i^{-1} = 0$. Red will stay on the route until he is captured. In this case, $T_i$ becomes infinite and our upper bound in equation (12) goes to infinity along with $G_i(T_i)$. Since $\operatorname{erf}(\infty) = 1$ and $\lim_{x\to\infty} e^{-x} = 0$, equation (14) becomes

$$A_i^0 = \frac{\exp\left\{\frac{q_i^2}{2\alpha_i}\right\}\sqrt{\pi}\,\bigl[1 - \operatorname{erf}(U_{i1})\bigr]}{\sqrt{2\alpha_i}} \tag{21}$$

If we carry this out recursively we see that the general form of (21) becomes

$$A_i^m = \frac{\exp\left\{\frac{q_i^2}{2\alpha_i}\right\}\sqrt{\pi}\,\bigl[\operatorname{erf}(U_{i2}) - \operatorname{erf}(U_{i1})\bigr]}{\sqrt{2\alpha_i}} + \exp\{-G_i(T_i)\}\,V_i^{m-1} \tag{22}$$

Applying this to the time when Red departs the route we get

$$T_i = \max\left(\frac{1 - q_i V_i^{m-1}}{\alpha_i V_i^{m-1}},\ 0\right) \tag{23}$$

Similarly, we derive $B_i^m$ in the following manner:

$$B_i^m = \frac{\exp\left\{\frac{p_i^2}{2\beta}\right\}\sqrt{\pi}\,\bigl[\operatorname{erf}(W_{i2}) - \operatorname{erf}(W_{i1})\bigr]}{\sqrt{2\beta}} + \exp\{-F_i(D_i)\}\,V_i^{m-1} \tag{24}$$

with

$$D_i = \max\left(\frac{1 - p_i V_i^{m-1}}{\beta V_i^{m-1}},\ 0\right) \tag{25}$$
Finding the expected time until capture along each route, assuming Red is not allowed to move (i.e., m = 0), is relatively simple. For m > 0 we must take into consideration the time Red expects to gain from moving. This is done by multiplying the expected time until detection on the route he is planning to move to by the probability that he will successfully complete the move.

As Red moves from route $i$ to route $j$, there is a probability $\rho_{ij}$ that he will complete his move without being captured. (We can think of $P$ as a symmetric matrix made up of these probabilities, based on the distances between routes, with zeros along the diagonal.) Red will then survive an additional $A_j$ or $B_j$ when he arrives. If Red moves to a route where Blue is not directly searching, he can expect to gain an additional $\tau_{ij}$ units of time if he survives the move:

$$\tau_{ij} = \rho_{ij} A_j \tag{26}$$

Similarly, if Red moves to a route where Blue is directly searching, he can expect to gain $\sigma_{ij}$ units of time if he survives the move:

$$\sigma_{ij} = \rho_{ij} B_j \tag{27}$$

Focusing on a single route $i$, we can see the expected gain in survival time Red may obtain from moving to a different route $j$. By ordering the $\tau_{ij}$ in decreasing order (i.e., $\tau_1 \ge \tau_2 \ge \dots \ge \tau_n$, dropping the fixed index $i$), we can use the following algorithm developed by Owen and McCormick to determine the increase in survival time after the move, $V_i$, from route $i$.

$$v_k = \frac{\displaystyle\sum_{j=1}^{k} \frac{\tau_j}{\tau_j - \sigma_j} - 1}{\displaystyle\sum_{j=1}^{k} \frac{1}{\tau_j - \sigma_j}} \tag{28}$$

Algorithm.
1. Let $k = n$ (the number of routes)
2. Let $v = v_k$, computed using (28)
3. If $v < \tau_k$, proceed to step 6
4. If $v \ge \tau_k$, let $k = \max\{j \mid \tau_j > v\}$
5. Return to step 2
6. Compute $x^*$ and $y^*$ using (29)
Solving for $(A_i^0, B_i^0)$ and then $V_i^0$ allows us to solve for $(A_i^1, B_i^1)$, $D_i^1$, $T_i^1$, and then $V_i^1$. Repeating this iterative process allows us to find the expected times until capture for increasing values of m, the number of times we allow Red to change routes.
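The iteration over m can be sketched as follows. This is only an illustration under stated assumptions: the closed forms follow (21) through (25), the continuation value $V_i^{m-1}$ is taken to be the game value from (28) computed over the routes other than $i$ (since the diagonal of the completion matrix is zero), and every name and parameter value is hypothetical.

```python
import math

def expected_time(p0, rate, v_next):
    """Closed form of (22)/(24); v_next <= 0 is used as the m = 0 case of (21)."""
    c = math.sqrt(2.0 * rate)
    scale = math.exp(p0 * p0 / (2.0 * rate)) * math.sqrt(math.pi) / c
    if v_next <= 0.0:                      # Red never moves
        return scale * math.erfc(p0 / c)
    stay = max((1.0 - p0 * v_next) / (rate * v_next), 0.0)   # (23) or (25)
    head = scale * (math.erf((p0 + rate * stay) / c) - math.erf(p0 / c))
    return head + math.exp(-(p0 * stay + 0.5 * rate * stay * stay)) * v_next

def game_value(tau, sigma):
    """Value v_k from (28) after the route-elimination steps; needs sigma < tau."""
    pairs = sorted(zip(tau, sigma), reverse=True)
    k = len(pairs)
    while True:
        num = sum(t / (t - s) for t, s in pairs[:k]) - 1.0
        den = sum(1.0 / (t - s) for t, s in pairs[:k])
        v = num / den
        if v < pairs[k - 1][0]:
            return v
        k = sum(1 for t, _ in pairs if t > v)

def iterate(q, alpha, p, beta, rho, m_max):
    """Return [(A^0, B^0), (A^1, B^1), ...] for m = 0..m_max."""
    n = len(q)
    A = [expected_time(q[i], alpha[i], 0.0) for i in range(n)]
    B = [expected_time(p[i], beta, 0.0) for i in range(n)]
    history = [(A, B)]
    for _ in range(m_max):
        V = []
        for i in range(n):
            others = [j for j in range(n) if j != i]   # Red moves to a different route
            V.append(game_value([rho[i][j] * A[j] for j in others],
                                [rho[i][j] * B[j] for j in others]))
        A = [expected_time(q[i], alpha[i], V[i]) for i in range(n)]
        B = [expected_time(p[i], beta, V[i]) for i in range(n)]
        history.append((A, B))
    return history
```

Calling `iterate` with three routes and a completion probability below one on every off-diagonal entry lets one watch successive $(A^m, B^m)$ pairs settle as m grows.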


Owen and McCormick found that, when computing these quantities for increasing values of m, the values change very little after approximately m = 10. In their paper they proved convergence under the assumption that some risk is incurred every time Red moves (i.e., $\rho_{ij} < 1$). The greater the risk (i.e., the smaller the $\rho_{ij}$), the faster the expected times converge. In a risky environment, it should not be necessary to compute for values of m greater than 10, as it is unlikely Red can look more than 5 or 6 steps in advance to see where he should move.


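Incidentally, if we were concerned only with Red’s attempt at survival, we could compute the optimal strategies for Red’s use of routes and Blue’s strategy for finding him. Given that the game has a value $v_k$, we can compute $x^*$ and $y^*$, the optimal strategies for Red and Blue respectively, in the following manner:

$$x_i^* = \frac{1/(\tau_i - \sigma_i)}{\sum_{j=1}^{k} 1/(\tau_j - \sigma_j)} \ \text{ for } 1 \le i \le k, \text{ and } 0 \text{ for } k+1 \le i \le n$$

$$y_i^* = \frac{\tau_i - v_k}{\tau_i - \sigma_i} \ \text{ for } 1 \le i \le k, \text{ and } 0 \text{ for } k+1 \le i \le n \tag{29}$$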
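The algorithm and the strategies in (28) and (29) can be sketched together in Python. The function name is an assumption, the inputs are taken to satisfy $\sigma_j < \tau_j$ for every route, and the sample payoffs are invented purely to exercise the route-elimination step.

```python
def solve_route_game(tau, sigma):
    """Value and mixed strategies for the hide/search game on n routes.

    tau[j]   : Red's payoff on route j when Blue searches elsewhere
    sigma[j] : Red's payoff on route j when Blue searches route j (sigma[j] < tau[j])
    Returns (v, x, y) with x for Red and y for Blue, in the caller's route order.
    """
    n = len(tau)
    order = sorted(range(n), key=lambda j: tau[j], reverse=True)  # tau decreasing
    t = [tau[j] for j in order]
    s = [sigma[j] for j in order]

    def value(k):
        """Equation (28) restricted to the k routes with the largest tau."""
        num = sum(t[j] / (t[j] - s[j]) for j in range(k)) - 1.0
        den = sum(1.0 / (t[j] - s[j]) for j in range(k))
        return num / den

    k = n                                   # step 1
    v = value(k)                            # step 2
    while v >= t[k - 1]:                    # steps 3 to 5
        k = max(j + 1 for j in range(n) if t[j] > v)
        v = value(k)

    den = sum(1.0 / (t[j] - s[j]) for j in range(k))
    x = [0.0] * n
    y = [0.0] * n
    for j in range(k):                      # step 6, equation (29)
        x[order[j]] = (1.0 / (t[j] - s[j])) / den
        y[order[j]] = (t[j] - v) / (t[j] - s[j])
    return v, x, y

if __name__ == "__main__":
    # Hypothetical payoffs: the third route is too poor for Red to use.
    print(solve_route_game([10.0, 8.0, 1.0], [2.0, 4.0, 0.5]))
```

In this invented example the route with the smallest $\tau$ drops out of both strategies, and each player is indifferent among the remaining routes, which is the defining property of the mixed equilibrium.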





A benefit of this model is its ability to determine where Red is most likely to move when he does decide to move. Owen and McCormick further explored this property under the assumption that we know the last location Red occupied [3]. We will not pursue the same analysis in this model. If we knew that Red had moved, and from which route, we would have a route to use without fear of ambush, which would remove the need for further route selection analysis. It is worth noting this property in the event Blue does learn of Red’s last position and wishes to send a patrol out to uncover him.
B. ANALYSIS OF SEARCH MODEL

1. Decreasing Expected Time to Detection for a Single Route

We make the assumption that Red can move multiple times, both to increase his chances of attacking Blue and to avoid detection. Therefore, the cases $A_i^0$ and $B_i^0$ do not concern us other than to establish the semi-steady state for $A_i^m$ and $B_i^m$. Blue’s goal is to drive $T_i$ and $D_i$ to zero in order to make that route unattractive to Red and safer for the convoy. Keeping in mind equations (23) and (25), simply increasing Blue’s indirect aggressiveness ($\alpha_i$) or direct aggressiveness ($\beta$) will decrease Red’s expected time to detection, but it will not bring it to zero. The ultimate goal is to drive the values to $q_i V_i^{m-1} \ge 1$ and $p_i V_i^{m-1} \ge 1$. Doing so brings Red’s departure times to zero, and he moves as soon as he finds himself on that route. Blue can accomplish this by placing all of his effort onto the route with the largest initial probability of indirect or direct detection. Since we are dealing with a single route, we will focus on direct detection from here on. With only one route the value becomes $B^0$ (keeping in mind that $V^{-1} = 0$ and letting $k = 1$ in equation (28)), with the additional assumption that Red will be on that route to ambush Blue and neither will switch to another route. Blue is then left with trying to drive $B^0$ to zero. How aggressive must the patrols be ($\beta$) to make this happen?

Taking equation (21) and applying it to $B^0$ gives us

$$B_i^0 = \frac{\exp\left\{\frac{p_i^2}{2\beta}\right\}\sqrt{\pi}\,\bigl[1 - \operatorname{erf}(W_{i1})\bigr]}{\sqrt{2\beta}} \tag{30}$$

Evaluating (30) shows us that $B^0$ can never be zero, since $\beta$ and $p_i$ are always positive. The only way to reduce the value of the route is to increase Blue’s direct aggressiveness ($\beta$), and the payoff for this effort decreases exponentially. This is what we have come to understand intuitively and, although we are not limiting ourselves to a single route, it can help us see our diminishing rate of return on effort along a single route. From this foundation we will shift our analysis to multiple routes.
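The diminishing return described above can be seen numerically by evaluating (30) for increasing $\beta$. The sketch below writes $1 - \operatorname{erf}(x)$ as `math.erfc(x)`, which is the same quantity but better behaved for large arguments. The function name and the values of `p` and `beta` are illustrative assumptions.

```python
import math

def b_zero(p, beta):
    """B^0 from equation (30), using erfc(x) = 1 - erf(x)."""
    w1 = p / math.sqrt(2.0 * beta)          # W_i1 from (19)
    # exp(w1**2) equals exp(p**2 / (2*beta)), the prefactor in (30).
    return math.exp(w1 * w1) * math.sqrt(math.pi) * math.erfc(w1) / math.sqrt(2.0 * beta)

if __name__ == "__main__":
    p = 0.2                                  # hypothetical initial detection probability
    previous = None
    for beta in (0.05, 0.10, 0.15, 0.20, 0.25):
        value = b_zero(p, beta)
        drop = None if previous is None else round(previous - value, 4)
        print(beta, round(value, 4), drop)   # each equal step in beta buys a smaller drop
        previous = value
```

Each equal increment in $\beta$ lowers $B^0$ by less than the previous one, and the value stays strictly positive however large $\beta$ becomes.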
2. Decreasing Expected Time to Detection for Multiple Routes

As with Ruckle’s work on the geometric approach to the game [2], we will assume Blue takes a single route to his destination (i.e., a straight line). We will also assume that the routes are independent of each other (i.e., they do not cross). This is important because, if the routes intersected at a common point, it would become a game with one route at that point. We can then study the routes independently of each other. Using the above model, what must Blue do to secure a route? Clearly, if he is very aggressive, both indirectly and directly ($\alpha_i$ and $\beta$), he can bring all $A_i$ and $B_i$ close to zero, but at the expense of spending greater resources. We have already established that they can never be zero because of the exponential nature of the risk. The best Blue can do is to drive the times at which Red will depart a route ($T_i$ and $D_i$) to zero, so that Red will move immediately away from that route if he is on it. If he is able to accomplish this, then equations (22) and (24) become:

$$A_i^m = \exp\{-q_i\}\,V_i^{m-1} \quad\text{and}\quad B_i^m = \exp\{-p_i\}\,V_i^{m-1}$$

Let us first focus on how Blue might get these departure times ($T_i$ and $D_i$) to zero. By increasing his indirect aggressiveness ($\alpha_i$), Blue can bring Red’s expected survival time if Blue is not on the route ($A_i$) closer to his expected survival time if Blue is on the route ($B_i$), and the value of that route goes to $B_i$. Because the value of a route depends more on $A_i$, Blue gets a greater payoff in the reduction of the route’s value by being indirectly aggressive ($\alpha_i$), but this does not get Red’s expected survival time any lower than if Blue were on that route ($B_i$). To reduce Red’s expected survival time further, and ultimately to bring it toward zero, Blue must focus his effort on directly finding Red. This suggests that Blue must dramatically increase his direct aggressiveness ($\beta$) for this to happen. However, Blue has another option.

As already mentioned, the value of a route in the presence of other routes ($V_i^{m-1}$) must be large enough that $q_i V_i^{m-1} \ge 1$ and/or $p_i V_i^{m-1} \ge 1$. To do this, Blue simply has to add routes under consideration. However, as Blue adds routes and increases $V_i^{m-1}$, the value of Red’s expected survival time with Blue on the route ($B_i^m$) goes up. To strike a balance between the two, we find that $V_i^{m-1} = 1/p_i$ is the minimum value a route needs in order to bring the time when Red moves ($D_i$) to zero (likewise $V_i^{m-1} = 1/q_i$ for $T_i$). To minimize the expected survival time for Red if Blue is on the route ($B_i^m$), Blue must now adjust his efforts so that $V_i^{m-1} = 1/p_i$ and no more. In reality, Blue will want to provide enough “usable” routes where Red will move immediately if he discovers that Blue is on them (i.e., $D_i = 0$), while accepting the increase in $B_i^m$ so that he can randomize which route he takes. For the purposes of this game, we will only consider those routes that Blue is considering using and on which he is currently exerting effort (either indirect or direct) to find Red. (Remember that $F_i(t)$ and $G_i(t)$ must be strictly increasing.) In other words, Blue wants to provide enough viable routes for Red to choose from and hide on while reducing the attractiveness of certain routes. This may not always be possible, since Blue often has only a finite number of routes to choose from and must make the most of what is available. In addition to limited routes, the enemy also gets a vote. Red, being an intelligent player, will try to overcome Blue’s efforts to avoid him by applying his own ambush model.
III. AMBUSH MODEL


A. OVERVIEW 


No matter how unattractive Blue makes a route, if he continues to use it, Red will be tempted to move onto that route and ambush Blue. Red will risk direct detection to achieve his own goals. Blue’s probability of encountering a hazard, either indirect or direct, will increase as he continues to use the same route convoy after convoy. We assume that Red’s efforts to intercept Blue remain constant regardless of route and represent them with the variable $\gamma$. If this assumption is not valid, we will have to differentiate Red’s aggressiveness by route, as we did for $\alpha$. The more aggressive Red becomes, the greater the value of $\gamma$.


1. Indirect Hazard 


Every convoy runs the risk of not completing the journey regardless of enemy action. This could be the result of treacherous terrain (think of Hannibal’s journey across the Alps) or a longer route that can result in more breakdowns. With this in mind, we define $r_i$ as the initial probability that Blue fails to cross route $i$ for reasons other than an ambush. Note that this model takes the same form as our model for Red’s threat of indirect detection. As we did there, we will start by defining $S_i(t)$ to be the probability of an unsuccessful completion of a convoy:

$$S_i(t) = 1 - \exp\{-H_i(t)\} \tag{31}$$

where our strictly increasing risk is defined by

$$h_i(t) = r_i + \gamma t, \qquad H_i(t) = \int_0^t h_i(s)\,ds = t r_i + \frac{\gamma t^2}{2} \tag{32}$$

Following the same derivation we have used before, the expected survival time for Blue along route $i$, assuming he can change routes $m$ times, is:

$$C_i^m = \frac{\exp\left\{\frac{r_i^2}{2\gamma}\right\}\sqrt{\pi}\,\bigl[\operatorname{erf}(K_{i2}) - \operatorname{erf}(K_{i1})\bigr]}{\sqrt{2\gamma}} + \exp\{-H_i(M_i)\}\,V_i^{m-1} \tag{33}$$

where the limits of integration are given by

$$K_{i1} = \frac{r_i}{\sqrt{2\gamma}} \quad\text{and}\quad K_{i2} = \frac{r_i + \gamma M_i}{\sqrt{2\gamma}} \tag{34}$$

and the expected time of departure from route $i$ is

$$M_i = \max\left(\frac{1 - r_i V_i^{m-1}}{\gamma V_i^{m-1}},\ 0\right) \tag{35}$$

2. Direct Hazard 


Some routes give Red a greater advantage in ambushing Blue than others. Some routes provide more than adequate cover for Red to hide, or choke points for him to use. Given this condition, we will assign the initial probability of a successful ambush on each route as $s_i$, and our strictly increasing risk is defined by

$$j_i(t) = s_i + \gamma t, \qquad J_i(t) = \int_0^t j_i(z)\,dz = t s_i + \frac{\gamma t^2}{2} \tag{36}$$

This leads us to define our expected time of survival for Blue on route $i$ as

$$E_i^m = \frac{\exp\left\{\frac{s_i^2}{2\gamma}\right\}\sqrt{\pi}\,\bigl[\operatorname{erf}(L_{i2}) - \operatorname{erf}(L_{i1})\bigr]}{\sqrt{2\gamma}} + \exp\{-J_i(N_i)\}\,V_i^{m-1} \tag{37}$$

where the limits of integration are given by

$$L_{i1} = \frac{s_i}{\sqrt{2\gamma}} \quad\text{and}\quad L_{i2} = \frac{s_i + \gamma N_i}{\sqrt{2\gamma}} \tag{38}$$

and the expected time of departure from route $i$ is

$$N_i = \max\left(\frac{1 - s_i V_i^{m-1}}{\gamma V_i^{m-1}},\ 0\right) \tag{39}$$


3. Stochastic Process 


We use the same method as we did for the Search Model in developing the expected time to ambush for Blue on each route, given that Blue can change routes $m$ times. If $m = 0$, then our expected times for indirect and direct hazard, respectively, are:

$$C_i^0 = \frac{\exp\left\{\frac{r_i^2}{2\gamma}\right\}\sqrt{\pi}\,\bigl[1 - \operatorname{erf}(K_{i1})\bigr]}{\sqrt{2\gamma}} \quad\text{and}\quad E_i^0 = \frac{\exp\left\{\frac{s_i^2}{2\gamma}\right\}\sqrt{\pi}\,\bigl[1 - \operatorname{erf}(L_{i1})\bigr]}{\sqrt{2\gamma}} \tag{40}$$

As with Red, Blue will face some risk by changing routes, such as a greater distance to cover. We will continue to use $\tau$ and $\sigma$ to represent the expected gain in time from moving onto route $j$, depending on whether Red is absent or present (respectively), but we will use $\Theta$ to represent our matrix of completion probabilities, as we did with $P$ in the previous model.


If Blue moves to a route where Red is not directly searching, he can expect to
gain an additional $\tau_{ij}$ units of time if he survives the move

$$\tau_{ij} = \theta_{ij} C_j \tag{41}$$

Similarly, if Blue moves to a route where Red is directly searching he can expect to gain
$\sigma_{ij}$ units of time if he survives the move

$$\sigma_{ij} = \theta_{ij} E_j \tag{42}$$

Then by using (28) we can determine the value of each route. Using this we can
iteratively find the expected survival time along each route by increasing m until we see a
stable time appear while using equations (33) and (37). We should keep in mind that the
computation of the expected survival time after leaving route i (V) is done as before in
equation (28) with the expected times till indirect hazard being sorted in decreasing order
(i.e., $\tau_1 \ge \tau_2 \ge \dots \ge \tau_n$). If we wanted to see what Red and Blue’s optimal strategies are in
the absence of the search model, we can use equation (29). Note, however, that in this
case $x^*$ is Blue’s optimal strategy, since he is the row player, and $y^*$ is Red’s optimal
strategy, since he is the column player.


B. ANALYSIS OF AMBUSH MODEL 
1. Increasing Expected Survival Time on a Single Route 


As we saw in the analysis of the Search Model, restricting ourselves to a single
route reduces the value of the game to that of the expected survival time associated with a
direct hazard. Again, this value can never be zero but gets exponentially closer to zero
with increasing levels of effort ($\gamma$) on the part of Red to increase the probability of
hazard. The challenge for Blue now, if he wishes to make it through the route
successfully, is to ensure his expected survival time, $E^0$, is greater than Red’s expected
survival time, $B^0$, along that route. He can do this by making sure Red’s probability of
detection, p, is greater than his initial probability of a hazard, r, and that his efforts to
increase the rate of detection, $\beta$, are greater than Red’s efforts to increase the linear rate
of interference, $\gamma$.


Of course, intuitively, this is what one expects. If the route is a highway with 
clear fields of view, and Blue actively sends recons up and down the route, then he is apt 
to survive longer along that route than Red. The converse is also intuitive. If a route 
goes through an area where Red can easily hide, and Red is very aggressive along that 
route, then Blue’s survival time will be lower than that of Red. Given the two options,
Blue should always choose the former, thus leaving Red only the option of significantly
increasing his efforts $\gamma$ to bring $E^0$ closer to $B^0$. Given enough time though, Blue will


be ambushed along that route. Blue can reduce this risk by adding routes to choose from. 


This is especially true if the routes are not significantly favorable to Blue. 


2. Increasing Expected Survival Time on Multiple Routes 


In this model, Blue’s only influence over his expected time of survival until 
encountering a hazard is to add more routes under consideration. As we see in equations 
(33) and (37) the expected survival time after moving goes up as we add more routes. 
However, this also drives some of the times till departure, (35) and (39), to zero as some
routes become more favorable. As with the Search Model, it becomes disadvantageous
for Blue to add more routes if they do not offer an advantage over routes already under
consideration. There is also the practical matter of having only a finite number of 
possible routes to choose from in a realistic scenario. We will therefore limit our study to 


a few routes with the understanding that Blue has chosen from the best available to him. 
IV. BIMATRIX MODEL


A. OVERVIEW 


We have now developed two models. In each model, one player is trying to
maximize his survival time while minimizing the other player’s survival time. Taking
these two models we will define the payoff matrices in the following manner

$$\Psi \text{ is the Search Payoff Matrix where } \psi_{ij} = \begin{cases} A_i & i \neq j \\ B_i & i = j \end{cases}$$

$$\Lambda \text{ is the Ambush Payoff Matrix where } \lambda_{ij} = \begin{cases} C_i & i \neq j \\ E_i & i = j \end{cases} \tag{43}$$

(Note: from here on we will use the transpose of $\Lambda$, since we need Blue to play the
columns of both matrices, but $\Lambda$ was formed with Blue playing the rows.)

These payoff matrices form the basis for our bimatrix game. John Nash’s
renowned 1951 paper on non-cooperative games [4] proved that for every bimatrix
game a pair of strategies exists that, if played by both players, maximizes the value for
both. At this equilibrium point, neither player can obtain a greater value by applying a
different strategy while his opponent’s strategy remains unchanged. If both players
change their strategies, then either party, or both, can obtain a greater value for their
game.


Both Red and Blue can choose a single route (a pure strategy) or they can
randomize their route choice by assigning a probability to the likelihood that they will use
it (a mixed strategy). Let X be the set of all possible mixed strategies for Red and Y be
the set of all possible mixed strategies for Blue. The expected survival time for Red is
$E_{\text{Red}}(x, y) = x^T \Psi y$ for some $x \in X$ and $y \in Y$, and likewise for Blue the expected
survival time is $E_{\text{Blue}}(x, y) = x^T \Lambda^T y$ for some $x \in X$ and $y \in Y$. The Nash equilibrium
is the pair of mixed strategies that maximize the survival time for both players. Letting
$x^*$ and $y^*$ be our optimal strategies we can state it in the following way

$$E_{\text{Red}}(x^*, y^*) = x^{*T} \Psi y^* \ge x^T \Psi y^* \quad \forall x \in X$$

$$E_{\text{Blue}}(x^*, y^*) = x^{*T} \Lambda^T y^* \ge x^{*T} \Lambda^T y \quad \forall y \in Y$$

Once we find $x^*$ and $y^*$ we define the value of the game for each player as
$V_{\text{Red}} = x^{*T} \Psi y^*$ and $V_{\text{Blue}} = x^{*T} \Lambda^T y^*$. The player with the larger value has the advantage
given the Nash equilibrium identified. Unfortunately, there is not always a single
equilibrium point in their available strategies.


1. Finding Pure Equilibrium Points 


Finding pure equilibrium points is relatively easy. Looking at Red’s payoff
matrix, $\Psi$, we choose the largest expected survival time in each column. For Blue, we
look at the transpose of his payoff matrix, $\Lambda^T$, and choose the largest value in each row.
The locations (i, j) where these maxima occur simultaneously are the pure
equilibrium points, where Red will use route i and Blue will use route j. As an example, take the


following payoff matrices 


2 Sh 1 10 16 (18) 
w=/(6) 3 (6)| AT=|(22) 8 18 
4 1 (22) 16 12 


We have labeled using < > those column and row entries that are the largest for 
Red’s columns and Blue’s rows (respectively). From the example, we see that our 
equilibrium point is met when Blue chooses route 1 and Red chooses route 2. The 
advantage is clearly in Blue’s favor, as he is expected to survive longer than Red. This 
equilibrium should not be any surprise, as our payoff matrices are constructed in such a 
fashion that the pure strategy for either player will always be the route that provides the 
longest possible expected survival time, if the adversary is not on the route. Furthermore, 


since $A_i \ge B_i$ and $C_i \ge E_i$, the only possibility of Red and Blue choosing the same route


(equilibrium on the diagonal) is in the event both indirect and direct detection/hazard 
times are equal. We can intuitively understand this result, however unlikely, as both Red 


and Blue have nothing to gain if they end up on the same route. 
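The marking procedure described above (largest entry in each column for Red, largest entry in each row for Blue) can be sketched in a few lines of Python. This is an illustrative sketch only; the function name and the 3 x 3 matrices are hypothetical and are not the example matrices from the text.

```python
def pure_equilibria(red_payoff, blue_payoff):
    """Return 1-based (i, j) pairs where Red's entry is the largest in its
    column and Blue's entry is the largest in its row.
    Rows are Red's routes, columns are Blue's routes."""
    n_rows, n_cols = len(red_payoff), len(red_payoff[0])
    found = []
    for i in range(n_rows):
        row_max = max(blue_payoff[i])
        for j in range(n_cols):
            col_max = max(red_payoff[k][j] for k in range(n_rows))
            if red_payoff[i][j] == col_max and blue_payoff[i][j] == row_max:
                found.append((i + 1, j + 1))
    return found

# Hypothetical payoffs: Red's matrix and the transpose of Blue's matrix.
red = [[2, 5, 5],
       [6, 3, 6],
       [4, 4, 1]]
blue_t = [[10, 16, 18],
          [22, 8, 18],
          [22, 16, 12]]

if __name__ == "__main__":
    print(pure_equilibria(red, blue_t))
```

For these hypothetical matrices the only simultaneous maximum is at Red route 2 and Blue route 1.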
If Red or Blue wanted to alter the expected survival times, they could do so by
spending resources (increasing their aggressiveness) to increase the rate of detection or
hazards ($\alpha$, $\beta$ and $\gamma$), thereby decreasing the value for their opponent.


We must note that more than one pure strategy may appear using the above
method. The strategy to choose would be the one that offers the maximum survival time
to both players. If such a strategy does not exist, or multiple pure strategies occur with
the same value, Red and Blue will need to determine their optimal mixed strategies. This
is far more realistic, since Blue and Red will not always play a perfect game ensuring that
they never pick the same route. Blue and Red will want to randomize their route
selection so as not to give an advantage to the other player in detecting them.


2. Finding Mixed Equilibrium Points 


Finding all possible mixed Nash equilibria can be a daunting task. In 1964,
Lemke and Howson [5] showed how to obtain all of the mixed equilibrium points in a
two person game using non-linear programming. Their algorithm states that the
strategies $x^*$ and $y^*$ are Nash equilibria if and only if they maximize the following
non-linear equation and constraints [6]:

$$\max_{x,\,y,\,p,\,q} \; x^T \Psi y + x^T \Lambda^T y - p - q$$

subject to:

$$\Psi y \le p J_n, \quad \Lambda x \le q J_n \quad (\text{where } J_n \text{ is an } n \times 1 \text{ vector of all ones})$$

$$x_i \ge 0 \text{ and } y_i \ge 0 \quad \forall i \in (1..n) \tag{44}$$

$$\sum_{i=1}^{n} y_i = \sum_{j=1}^{n} x_j = 1$$

($p^* = V_{\text{Red}}$ and $q^* = V_{\text{Blue}}$)

Solving the above problem is best done using software. Barron [6] provides the 
Maple and Mathematica commands for setting up and solving such a problem. There are 
also multiple software packages available, such as SNOPT and KNITRO, which can be 


used for solving problems involving a large number of routes. For the examples given 


below with relatively few routes, we will rely on Maple’s NLPSolve command to find 
multiple mixed Nash Equilibria. We accomplish this by altering the starting point for the
non-linear program search by varying the values of p and q. This is no trivial task as the
upper bound on the number of equilibria is theorized to be $2^n - 1$ in an $n \times n$ game [7].
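Whatever solver is used, a candidate pair of strategies can be checked against the program in (44) directly: with p and q set to the smallest values the constraints allow, the objective is never positive and equals zero exactly at a Nash equilibrium. The Python sketch below illustrates that check only; it is not a replacement for NLPSolve, and the function name and the 3 x 3 matrices are hypothetical.

```python
def nlp_objective(x, y, psi, lam_t):
    """Evaluate x'Psi y + x'Lam^T y - p - q using the smallest feasible p and q.
    psi and lam_t are square payoff matrices with Red on the rows."""
    n = len(x)
    red_rows = [sum(psi[i][j] * y[j] for j in range(n)) for i in range(n)]     # (Psi y)_i
    blue_cols = [sum(x[i] * lam_t[i][j] for i in range(n)) for j in range(n)]  # (Lam x)_j
    p, q = max(red_rows), max(blue_cols)   # tightest values satisfying the constraints
    v_red = sum(x[i] * red_rows[i] for i in range(n))
    v_blue = sum(blue_cols[j] * y[j] for j in range(n))
    return v_red + v_blue - p - q, p, q

# Hypothetical payoff matrices (Red's matrix and the transpose of Blue's matrix).
psi = [[2, 5, 5], [6, 3, 6], [4, 4, 1]]
lam_t = [[10, 16, 18], [22, 8, 18], [22, 16, 12]]

if __name__ == "__main__":
    # Red on route 2, Blue on route 1: objective 0, so the pair is an equilibrium.
    print(nlp_objective([0, 1, 0], [1, 0, 0], psi, lam_t))
    # Red on route 1, Blue on route 1: objective negative, so it is not.
    print(nlp_objective([1, 0, 0], [1, 0, 0], psi, lam_t))
```

The returned p and q are the players' values at the candidate point, which is why the individual game values make natural starting points for the search.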


In a perfect world, Blue and Red will agree to use a pure equilibrium, and they 
will never meet on a route. Both will choose the routes that provide them the greatest 
indirect detection time without being on the route together. This is unlikely as each will 
be inclined to use this predictability to their advantage and actively intercept the other. A 
better way to approach this problem is to determine the value of the individual games 
(search or ambush) and use these as our starting points for determining the mixed 
equilibrium strategies (i.e., p and q in (44)). We can determine the expected payoffs of the
individual games in the following way

$$V_{\text{Search}} = x^{*T}_{\text{Red-Search}} \, \Psi \, y^*_{\text{Blue-Search}}$$

$$V_{\text{Ambush}} = x^{*T}_{\text{Red-Ambush}} \, \Lambda^T \, y^*_{\text{Blue-Ambush}} \tag{45}$$

These two values, while interesting, do not take into account the dynamics of both 
players being threatened while simultaneously threatening their opponent. What they do 
provide is a starting point when we apply non-linear programming to determine both 
players’ optimal strategies when faced with their competing self-interests of survival and 
attack. Depending on the risk either faces from moving, we may still encounter pure 
strategies where Red or Blue decide to use a single route rather than run the risk of 
changing routes even if they can shorten the expected survival time of the other. Clearly, 
if the risk of movement is not too great, a mixed strategy would best benefit both players 
as they can randomize their route selection. In the examples that follow, we will explore 


several variations of this game. 
B. EXAMPLES 


1. High Risk of Movement for Red and Uniform $\alpha$


In this example, we will examine the game where Blue may choose from six
routes. The situation is such that Red faces a significant risk every time he decides to
move (less than 50% probability of successful completion of the move). Blue has
relative freedom of movement (greater than 90% probability of successfully changing
routes). The rate of increase in the probability of detection is uniform for all players
($\alpha = \beta = \gamma = .01$).


We list Red and Blue’s initial probabilities of detection in the following table, 
along with the variables assigned to them in Chapters II and III. 


p (Search Active) | q (Search Passive) | j (Ambush Active) | r (Ambush Passive) 


0.7 
0.4 
0.3 
0.2 

eS) 





pS 
a ae ae 
ae Se ae ee 
SS a a 
p92 _|__o1__ 


Table 1. Example I: Probabilities of Ambush / Detection


Our first step is to determine the expected time to capture / ambush assuming that
neither Red nor Blue is allowed to change routes (m = 0). Doing so using the Search
Model gives us

m=0    1      2      3      4      5      6
A^0    3.05   2.37   6.56   4.21   6.56   6.56
B^0    1.93   1.40   2.37   3.05   4.21   1.93

Table 2. Example I: Red’s Expected Survival Time with No Moves


Using this, we can then start to determine the value of the routes and the time Red 
will stay on each route. To compute how much extra time Red expects to gain from 
moving from route i to route j, we need to define the risk he faces during the move. 
Using (26), (27), (41), (42) and Red and Blue’s probabilities of successfully changing
routes, given respectively by the matrices P and $\Theta$ below, we determine that letting m = 4
we reach stability in that the values for A, B, T, D, and V converge to within the first two


decimal places. 
P    1      2      3      4      5      6
1    0      0.452  0.409  0.452  0.435  0.402
2    0.452  0      0.452  0.435  0.452  0.435
3    0.409  0.452  0      0.402  0.435  0.452
4    0.452  0.435  0.402  0      0.452  0.409
5    0.435  0.452  0.435  0.452  0      0.452
6    0.402  0.435  0.452  0.409  0.452  0

Table 3. Example I: Red’s Probabilities for a Successful Move


Θ    1      2      3      4      5      6
1    0      0.98   0.96   0.94   0.93   0.92
2    0.98   0      0.98   0.96   0.94   0.93
3    0.96   0.98   0      0.98   0.96   0.94
4    0.94   0.96   0.98   0      0.98   0.96
5    0.93   0.94   0.96   0.98   0      0.98
6    0.92   0.93   0.94   0.96   0.98   0

Table 4. Example I: Blue’s Probabilities for a Successful Move





The values of A, T, B, D, and V all correspond to the equations given in Chapter II.

m=4    1      2      3      4      5      6
A^4    3.05   2.45   6.56   4.21   6.56   6.56
T^4    13.83  1.04   34.97  23.08  40.76  33.76
B^4    2.28   2.44   2.38   3.05   4.21   2.29
D^4    0      0      4.97   13.08  30.76  0
V^4    2.28   2.44   2.22   2.32   1.97   2.29

Table 5. Example I: Red’s Expected Survival Time with Multiple Moves


For this example, the optimal strategies for Red and Blue (in the absence of the
Ambush model) are

                    1    2    3        4    5        6
x*_{Red-Search}     0    0    0.2661   0    0.4734   0.2605
y*_{Blue-Search}    0    0    0.2661   0    0.4734   0.2605

Table 6. Example I: Optimal Search Strategies
Red can expect the following survival time using this strategy:
$V_{\text{Search}} = x^{*T}_{\text{Red-Search}} \, \Psi \, y^*_{\text{Blue-Search}} = 5.446$. Note that Red and Blue share the same route
selection strategies in this example. This will not always be the case.

We need to now find the expected survival times for Blue using the Ambush
model. As with the Search model, we start by determining the initial expected survival
times along each route assuming that Blue is not allowed to change routes once one is
selected (m = 0).

m=0    1      2      3      4      5      6
C^0    3.13   3.13   2.55   4.13   2.55   3.13
E^0    1.82   1.39   2.13   1.82   1.58   2.13

Table 7. Example I: Blue’s Expected Survival Time with No Moves


Since Blue faces less risk moving from cell to cell, the expected times of survival
converge much more slowly than for Red. By m = 35 we get convergence in the first two
decimal places.

Table 8 shows the values of C, M, E, N, and V as given in Chapter III.

m=35   1      2      3      4      5      6
C^35   4.78   4.80   4.57   4.79   4.53   4.77
M^35   1.25   1.24   0.19   1.24   0.21   1.25
E^35   4.45   4.48   4.56   4.47   4.52   4.44
N^35   0      0      0      0      0      0
V^35   4.45   4.48   4.56   4.47   4.52   4.44

Table 8. Example I: Blue’s Expected Survival Time with Multiple Moves


It should not surprise us that all N are zero. With so little risk to Blue’s
movements, he will change routes immediately if he finds himself on the same route as
Red. Keep in mind that in this game Blue is playing the rows, and Red is playing the
columns of our matrix $\Lambda$; however, for the sake of consistency, we will continue to define
Blue’s optimal strategy as y and Red’s as x. The optimal mixed strategy for each player
in the absence of the Search model thus becomes

                    1        2        3    4        5    6
x*_{Red-Ambush}     0.2519   0.2519   0    0.2519   0    0.2443
y*_{Blue-Ambush}    0.2125   0.3063   0    0.2750   0    0.2062

Table 9. Example I: Optimal Ambush Strategies


The value of the Ambush model is
$V_{\text{Ambush}} = y^{*T}_{\text{Blue-Ambush}} \, \Lambda \, x^*_{\text{Red-Ambush}} = 4.702$,
and this also gives us the time Blue can expect to survive without an ambush using the
available routes. We note here that $V_{\text{Ambush}} < V_{\text{Search}}$ and, given these results alone, we
expect an ambush to occur before Red is discovered. In this event, Blue may want to find
additional routes or find a way to increase the rate of indirect detection in those cells still
relevant for Red to use.
Using the Search Model, we can vary the values for $\alpha$ and $\beta$ to see the effect of
each. The graphs below show the change in $V_{\text{Search}}$ as we vary these parameters while
keeping the other one constant.

Figure 1. $\alpha$ varies while $\beta$ = .01

Figure 2. $\beta$ varies while $\alpha$ = .01 (same graph, different scales for horizontal axis)


Clearly, Blue’s indirect efforts to detect Red ($\alpha$) reduce Red’s expected survival
time ($V_{\text{Search}}$) with greater efficiency than Blue’s direct efforts ($\beta$). As expected, as both
direct and indirect efforts increase they show a decreasing return on their ability to reduce 
Red’s expected survival time. If Blue finds himself at a disadvantage, (i.e., his expected 
survival time is less than Red’s) this analysis can help the commander decide where to 
place his efforts (direct or indirect) to get the best reduction in Red’s expected survival 


time. 


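Taking both games into consideration, we obtain the following bi-matrix
consisting of $\Psi$ and $\Lambda^T$; each cell contains a value from each individual payoff matrix
$(\psi_{ij}, \lambda_{ji})$. Rows are Red’s routes and columns are Blue’s routes.

     1             2             3             4             5             6
1    (2.28, 4.45)  (3.05, 4.80)  (3.05, 4.57)  (3.05, 4.80)  (3.05, 4.53)  (3.05, 4.77)
2    (2.45, 4.78)  (2.44, 4.48)  (2.45, 4.57)  (2.45, 4.80)  (2.45, 4.53)  (2.45, 4.77)
3    (6.56, 4.78)  (6.56, 4.80)  (2.38, 4.56)  (6.56, 4.80)  (6.56, 4.53)  (6.56, 4.77)
4    (4.21, 4.78)  (4.21, 4.80)  (4.21, 4.57)  (3.05, 4.48)  (4.21, 4.53)  (4.21, 4.77)
5    (6.56, 4.78)  (6.56, 4.80)  (6.56, 4.57)  (6.56, 4.80)  (4.21, 4.52)  (6.56, 4.77)
6    (6.56, 4.78)  (6.56, 4.80)  (6.56, 4.57)  (6.56, 4.80)  (6.56, 4.53)  (2.29, 4.44)

Table 10. Example I: Bimatrix Model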


We immediately note that there are multiple pure Nash Equilibrium points in the 


above matrix. The points that yield the maximum survival time for both are outlined in 
black above. If Blue and Red were purely rational players, they would avoid each other
completely. Blue would choose either route 2 or 4 and Red would choose 3, 5 or 6. Blue 
would then obtain a value of 4.80 and Red a value of 6.56 for the bimatrix game. Clearly 
Red has the advantage in this scenario. To find the mixed Nash Equilibrium points we 


resort to non-linear optimization. 


Why do we need to find mixed equilibrium, if we already have several pure 
strategies from which to choose? If Red and Blue collaborate to ensure they both avoid 
capture/ambush for the longest possible time, then pure strategies are the answer. 
However, both parties can be less concerned about their own survival time and more 
concerned with reducing the survival time of their adversary. We call these different
strategies risk averse (wishing to maximize one’s own survival time) and risk prone
(disregarding self-preservation in an effort to reduce the other’s). By adjusting the initial
point in our non-linear programming, we arrive at different optimal mixed strategies. In
the tables below, we declare our starting point as (p, q) where p is Red’s payoff from the
Search game and q is Blue’s payoff from the Ambush game (refer to equation (44)).


We assume that each player wishes to find the optimal strategy that gets them as
close to the value of their individual games as possible. This means we conduct our
non-linear optimization from the initial value of (5.446, 4.702). Using a software package (in
this case Maple’s NLPSolve command) we obtain the following mixed strategies:

(5.446, 4.702)   1    2    3       4    5       6
x*_{Red}         0    0    0.283   0    0.717   0
y*_{Blue}        0    1    0       0    0       0

Table 11. Example I: Optimal Bimatrix, Risk Averse Strategies


Our players are now using strategies that are consistent with the pure strategies
previously noted (and only Red is using a mixed strategy). Note that they still do
not choose routes that intersect with their adversary. Using these mixed strategies,
we can determine the value of the game for each player as:
$V_{\text{Red}} = x^{*T} \Psi y^* = 6.56$ and $V_{\text{Blue}} = x^{*T} \Lambda^T y^* = 4.80$. This should not be surprising as the
pure strategies all lead to the same values of the game for each player. In practical
applications, any mixed strategy involving the pure strategies will yield the same result.
Over the course of time, each player is benefited by randomizing their choices so a mixed
strategy is preferable over a pure strategy, especially when they lead to the same values
for the game.
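These two values can be reproduced from the tabulated quantities. The Python sketch below is illustrative only: it assembles the two payoff matrices from the route values in Tables 5 and 8 (the indirect value off the diagonal and the direct value on the diagonal, with Red on the rows) and evaluates the bilinear forms for the strategies in Table 11. The function names are assumptions, and Blue's value for route 4 is taken as 4.80, the figure quoted in the discussion of the pure strategies, rather than the 4.79 listed in Table 8.

```python
def build_payoffs(a, b, c, e):
    """Red's matrix uses A_i off the diagonal and B_i on it; the transposed
    Blue matrix uses C_j off the diagonal and E_j on it (Red on the rows)."""
    n = len(a)
    psi = [[b[i] if i == j else a[i] for j in range(n)] for i in range(n)]
    lam_t = [[e[j] if i == j else c[j] for j in range(n)] for i in range(n)]
    return psi, lam_t

def bilinear(x, matrix, y):
    """Compute x' M y for plain Python lists."""
    return sum(x[i] * matrix[i][j] * y[j]
               for i in range(len(x)) for j in range(len(y)))

A = [3.05, 2.45, 6.56, 4.21, 6.56, 6.56]   # Table 5, Red, adversary elsewhere
B = [2.28, 2.44, 2.38, 3.05, 4.21, 2.29]   # Table 5, Red, adversary on the route
C = [4.78, 4.80, 4.57, 4.80, 4.53, 4.77]   # Table 8, Blue (route 4 assumed 4.80)
E = [4.45, 4.48, 4.56, 4.47, 4.52, 4.44]   # Table 8, Blue, adversary on the route

x_red = [0, 0, 0.283, 0, 0.717, 0]         # Table 11
y_blue = [0, 1, 0, 0, 0, 0]

psi, lam_t = build_payoffs(A, B, C, E)
v_red = bilinear(x_red, psi, y_blue)
v_blue = bilinear(x_red, lam_t, y_blue)

if __name__ == "__main__":
    print(round(v_red, 2), round(v_blue, 2))
```

Because Blue places all of his weight on route 2 and Red places none there, every term that contributes comes from an off-diagonal cell, which is why the values coincide with those of the pure equilibria.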


What if each player were so aggressive that he was not concerned with maximizing
his own survival time? By setting our initial point to (0,0) we get the following
strategies:

(0,0)        1    2       3       4       5       6
x*_{Red}     0    0       0.283   0       0.717   0
y*_{Blue}    0    0.643   0       0.357   0       0

Table 12. Example I: Optimal Bimatrix, Risk Prone Strategies


Blue has now adopted a mixed strategy and Red’s strategy has not changed. The
values of the games are unchanged at: $V_{\text{Red}} = x^{*T} \Psi y^* = 6.56$ and $V_{\text{Blue}} = x^{*T} \Lambda^T y^* = 4.80$.


Clearly there are multiple strategies for Red and Blue that lead to the same values. In 
reality, a route that is conducive to a successful ambush (choke points with lots of cover) 
is also conducive to hiding. Likewise, a route that is favorable for a convoy (wide open 
spaces) is not favorable for the enemy seeking to avoid detection. Therefore, it should 


not be surprising that Red and Blue seek out different routes. 


Blue can use this information to his advantage by choosing his route strategy
from among the mixed strategies over routes 2 and 4 that lead to $V_{\text{Blue}} = 4.80$ while avoiding
routes 3 and 5. Blue also has the benefit of learning the mixed strategy Red will adopt in
the Nash Equilibrium. It is important to note that if Blue were to use this information to
send a separate patrol to find Red using this strategy, it would violate the assumptions set
forth at the beginning of this paper. The game would then become one of 3 players (vs.
2). Next, we will see what happens when Red is allowed to move with less risk.


2. Low Risk of Movement for Red and Uniform $\alpha$


In this example, we will set the probabilities of detection/hazard for Red and Blue 
(indirect and direct) very close to each other. This will drive them to consider the same 
routes. We will also make it less likely that Red will be intercepted in transit. As in 
Example I with Blue, both players will have a 90% or greater probability of successfully 
changing routes. The linear rate of increase in the probability of detection/hazard is 


uniform for all players ($\alpha = \beta = \gamma = .01$).


Red and Blue’s initial probabilities of detection are given in the following table 


along with the variables assigned to them in Chapters II and III. 


Route   p (Search Active)   q (Search Passive)   j (Ambush Active)   r (Ambush Passive)
1       0.5                 0.3                  0.4                 0.2
2       0.5                 0.4                  0.6                 0.1
3       0.4                 0.2                  0.4                 0.2
4       0.3                 0.2                  0.3                 0.1
5       0.3                 0.1                  0.2                 0.1
6       0.2                 0.1                  0.3                 0.1

Table 13. Example II: Probabilities of Ambush / Detection


As before, our first step is to determine the expected time to capture / ambush
assuming that neither Red nor Blue is allowed to change routes (m = 0). Doing so using
the Search Model gives us

m=0    1      2      3      4      5      6
A^0    3.05   2.37   6.56   4.21   6.56   6.56
B^0    1.93   1.40   2.37   3.05   4.21   1.93

Table 14. Example II: Red’s Expected Survival Time with No Moves


Once again, we go through the process outlined in Example 1 to determine the
final values for each route. Both Blue and Red face little risk in moving, as given in Table
15. Our equations (26), (27), (41) and (42) converge much more slowly, and we must calculate
larger values of m before reaching a stable solution. As with Blue in the first example,
we will need m = 35 to get convergence to the first two decimal places.

     1      2      3      4      5      6
1    0      0.98   0.96   0.94   0.93   0.92
2    0.98   0      0.98   0.96   0.94   0.93
3    0.96   0.98   0      0.98   0.96   0.94
4    0.94   0.96   0.98   0      0.98   0.96
5    0.93   0.94   0.96   0.98   0      0.98
6    0.92   0.93   0.94   0.96   0.98   0

Table 15. Example II: Red and Blue’s Probabilities for a Successful Move





The values of A, T, B, D, and V are found with the equations given in Chapter II.

m=35   1      2      3      4      5      6
A^35   6.86   6.96   7.49   7.13   7.55   7.52
T^35   0      0      4.51   0      4.29   4.34
B^35   6.86   6.96   6.89   7.13   7.00   6.95
D^35   0      0      0      0      0      0
V^35   6.86   6.96   6.89   7.13   7.00   6.95

Table 16. Example II: Red’s Expected Survival Time with Multiple Moves


For this example, the optimal strategies for Red and Blue (in the absence of the
Ambush model) are

                    1    2    3        4    5        6
x*_{Red-Search}     0    0    0.3206   0    0.3459   0.3335
y*_{Blue-Search}    0    0    0.2673   0    0.3992   0.3335

Table 17. Example II: Optimal Search Strategies

Red can expect the following survival time using this strategy:
$V_{\text{Search}} = x^{*T}_{\text{Red-Search}} \, \Psi \, y^*_{\text{Blue-Search}} = 7.330$.
Doing the same for Blue, we obtain the following.

m=0    1      2      3      4      5      6
C^0    4.21   6.56   4.21   6.56   6.56   6.56
E^0    2.37   1.62   2.37   3.05   4.21   2.03

Table 18. Example II: Blue’s Expected Survival Time with No Moves


Table 19 shows the values of C, M, E, N, and V as given in Chapter III.

m=35   1      2      3      4      5      6
C^35   7.24   7.70   7.43   7.80   7.83   7.80
M^35   0      3.82   0      3.54   3.45   3.54
E^35   7.24   7.24   7.43   7.39   7.44   7.38
N^35   0      0      0      0      0      0
V^35   7.24   7.24   7.43   7.39   7.44   7.38

Table 19. Example II: Blue’s Expected Survival Time with Multiple Moves


As with Blue in Example 1, all D and N are zero, since there is little risk to Red or
Blue to move if they find themselves in the same cell as their adversary. We will
continue to define Blue’s optimal strategy as y and Red’s as x. The optimal mixed
strategy for each player considering only the Ambush model thus becomes

                    1    2        3    4        5        6
x*_{Red-Ambush}     0    0.0426   0    0.2862   0.3872   0.2841
y*_{Blue-Ambush}    0    0.2253   0    0.2545   0.2659   0.2543

Table 20. Example II: Optimal Ambush Strategies


The value of the Ambush model is $V_{\text{Ambush}} = y^{*T}_{\text{Blue-Ambush}} \, \Lambda \, x^*_{\text{Red-Ambush}} = 7.6776$,
and this also gives us the time Blue can expect to survive without an ambush using the
available routes. Unlike our previous example, $V_{\text{Ambush}} > V_{\text{Search}}$, leading us to believe
that Blue has a slight advantage in this scenario and can expect to live longer.
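As before, we obtain the following bi-matrix consisting of $\Psi$ and $\Lambda^T$; each cell
contains a value from each individual payoff matrix $(\psi_{ij}, \lambda_{ji})$. Rows are Red’s routes and
columns are Blue’s routes.

     1             2             3             4             5             6
1    (6.86, 7.24)  (6.86, 7.70)  (6.86, 7.43)  (6.86, 7.79)  (6.86, 7.83)  (6.86, 7.79)
2    (6.96, 7.24)  (6.96, 7.24)  (6.96, 7.43)  (6.96, 7.79)  (6.96, 7.83)  (6.96, 7.79)
3    (7.49, 7.24)  (7.49, 7.70)  (6.89, 7.43)  (7.49, 7.79)  (7.49, 7.83)  (7.49, 7.79)
4    (7.13, 7.24)  (7.13, 7.70)  (7.13, 7.43)  (7.13, 7.39)  (7.13, 7.83)  (7.13, 7.79)
5    (7.55, 7.24)  (7.55, 7.70)  (7.55, 7.43)  (7.55, 7.79)  (7.00, 7.44)  (7.55, 7.79)
6    (7.52, 7.24)  (7.52, 7.70)  (7.52, 7.43)  (7.52, 7.79)  (7.52, 7.83)  (6.95, 7.38)

Table 21. Example II: Bimatrix Model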


The pure Nash Equilibria are highlighted in Table 21. Interestingly, both Red and 
Blue would prefer to use Route 5 as it provides them the greatest value if their adversary 
is not along that route. To avoid confrontation, which would decrease their survival time, 
Red and Blue would benefit from using the route pairs (5,4), (6,5), or (5,6) to maximize 
their survival times. As mentioned previously, a pure strategy is not viable over time as it 
provides the adversary a clearer picture of where to find you. We see through our non- 
linear programming that a mixed strategy provides Red and Blue both with greater values 
for their individual games though at a risk that they might take the same route at the same 


time. 


Using non-linear optimization we will start our search for mixed Nash Equilibria
with the values of the individual games $V_{\text{Search}} = 7.330$ and $V_{\text{Ambush}} = 7.6776$. This
produces the following optimal strategies:

(7.330, 7.678)   1    2    3        4        5        6
x*_{Red}         0    0    0.8974   0        0.1026   0
y*_{Blue}        0    0    0        0.8383   0.1091   0.0526

Table 22. Example II: Optimal Bimatrix, Risk Averse Strategies

Using these mixed strategies, Red and Blue obtain the following values for the
game: $V_{\text{Red}} = x^{*T} \Psi y^* = 7.49$ and $V_{\text{Blue}} = x^{*T} \Lambda^T y^* = 7.79$. Note how their mixed strategies
give them a greater payoff than the pure strategies. This comes from an interesting
choice for how they randomly choose the route they will take.


From Table 21, Red’s route preference (from highest value to lowest) should be 
5-6-3-4-2-1. Blue’s preference should be 5-4-6-1-2-3 where 4 and 6 could be 
interchanged, since they have the same value. Interestingly, the mixed strategies in Table 
22 show that Red will avoid 6, even though it is his second highest valued route, to avoid 
Blue. The values for Route 5 are great enough that both Red and Blue will risk taking 
route 5 approximately 10% of the time to increase their overall value leading them to 


increase their overall values from the pure strategies. 
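Red's side of this trade-off can be examined by computing what each single route would earn him against Blue's mixed strategy in Table 22. The Python sketch below is illustrative only; the function name is an assumption, the route values are those of Table 16, and the entries for route 4 are taken as 7.13 to match the route value V reported there.

```python
def red_route_payoffs(a, b, y):
    """Red's expected survival time on each single route against Blue's mix y:
    A_i when Blue is elsewhere (probability 1 - y_i), B_i when Blue is on route i."""
    return [a[i] * (1.0 - y[i]) + b[i] * y[i] for i in range(len(a))]

A = [6.86, 6.96, 7.49, 7.13, 7.55, 7.52]     # Table 16 (route 4 assumed 7.13)
B = [6.86, 6.96, 6.89, 7.13, 7.00, 6.95]     # Table 16 (route 4 assumed 7.13)
y_blue = [0, 0, 0, 0.8383, 0.1091, 0.0526]   # Table 22

payoffs = red_route_payoffs(A, B, y_blue)

if __name__ == "__main__":
    for route, value in enumerate(payoffs, start=1):
        print(route, round(value, 2))
```

Against this mix, routes 3, 5 and 6 each return about 7.49 for Red while routes 1, 2 and 4 return less, so the routes Red actually uses are all best responses and his value matches the 7.49 reported above.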


As in Example 1, we set our initial value at (0,0) to represent a more aggressive
game where each player wishes to obtain a strategy that brings their opponent’s values to
zero. Doing so produces the following optimal strategies:

(0,0)        1    2    3        4        5        6
x*_{Red}     0    0    0.8974   0        0.1026   0
y*_{Blue}    0    0    0        0.8383   0.1091   0.0526

Table 23. Example II: Optimal Bimatrix, Risk Prone Strategies


Clearly, there is no change in optimal strategies from our previous initial starting 
point and, therefore, the values of the game for Red and Blue go unchanged. In fact, by 
varying our initial point we can see that this mixed strategy is relatively stable. Each 
player can be assured of the outcome regardless of the aggressiveness of their adversary. 
As before, Blue can then use this information to route his convoy using y knowing which 


routes he is most likely to encounter Red on. 
V. TOPICS FOR FURTHER RESEARCH


A. APPLICATION 


Practical application of this method is reliant on determining the initial 
probabilities of indirect and direct detection for each route under consideration. While 
this may be impossible to do with any certainty, approximations based on the relative risk 
each route presents can provide a starting point. This model also offers the ability to
determine the effect, if any, that indirect detection methods (rewards, humanitarian efforts,
etc.) have on the enemy’s strategy for route selection in the face of trying to engage a
convoy. Also, by assigning cost to both direct and indirect measures, the commander can 


best determine which investment returns the greatest survival times for his convoy. 


B. POSSIBLE FOLLOW-ON RESEARCH 


This model provides only a starting point in exploring the relationships between 
the indirect and direct aggressiveness each player exhibits in trying to minimize their 
opponent’s survival time while maximizing their own. This model could easily be 
adopted for cities where Blue wishes to dissuade enemy activity through both direct and 
indirect means. In this scenario, an optimal control problem is clearly present. What is 
the balance of direct and indirect aggressiveness that minimizes the time Red stays in the 
city that also allows Blue to maximize resources? Another avenue of research is the 
relationship between Blue’s aggressiveness ($\alpha$, $\beta$) and Red’s aggressiveness ($\gamma$). This can
be applied to the classic problem faced by law enforcement. As each player gets
increasingly aggressive in their direct attempts to eliminate their opponent, they are
greeted with increasing direct aggressiveness from their opponent. An indirectly aggressive
approach may be more appropriate and this model provides for exploring that option.
Finally, another possible research path is to apply this model to a network of routes where 
flow analysis can be combined with the expected survival times along each route to 
determine the best overall route to take. In conclusion, this model can be readily adopted 
for a myriad of problems where two parties have conflicting goals and two methods of 


achieving those goals. 